
    AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets

    This paper studies Binary Neural Networks (BNNs), in which both weights and activations are binarized to 1-bit values, greatly reducing memory usage and computational complexity. Because modern deep neural networks adopt sophisticated architectures for the sake of accuracy, the distributions of weights and activations vary widely across layers, so the conventional sign function cannot effectively binarize full-precision values in BNNs. To this end, we present a simple yet effective approach called AdaBin that adaptively obtains an optimal binary set $\{b_1, b_2\}$ ($b_1, b_2 \in \mathbb{R}$) of weights and activations for each layer, instead of a fixed set (i.e., $\{-1, +1\}$). In this way, the proposed method can better fit different distributions and increase the representation ability of binarized features. In practice, we use the center position and distance of the two 1-bit values to define a new binary quantization function. For the weights, we propose an equalization method that aligns the symmetric center of the binary distribution with that of the real-valued distribution and minimizes the Kullback-Leibler divergence between them. For the activations, we introduce a gradient-based optimization method to obtain these two parameters, which are jointly trained in an end-to-end manner. Experimental results on benchmark models and datasets demonstrate that the proposed AdaBin achieves state-of-the-art performance. For instance, we obtain 66.4% Top-1 accuracy on ImageNet with a ResNet-18 architecture, and 69.4 mAP on PASCAL VOC with SSD300. The PyTorch code is available at https://github.com/huawei-noah/Efficient-Computing/tree/master/BinaryNetworks/AdaBin and the MindSpore code is available at https://gitee.com/mindspore/models/tree/master/research/cv/AdaBin. Comment: ECCV 2022
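    To make the quantizer concrete, below is a minimal PyTorch sketch of an AdaBin-style binarizer as we read the abstract: each value is snapped to the nearer element of a learned set {c - d, c + d}, with a straight-through gradient. The names (sign_ste, AdaBinActivation, adabin_weights) and the mean/mean-absolute-deviation weight fit are our illustrative assumptions, not the released code; the paper fits the weight parameters by minimizing a KL divergence.

        import torch
        import torch.nn as nn

        def sign_ste(z):
            # sign() in the forward pass; hardtanh-style straight-through gradient in backward.
            soft = torch.clamp(z, -1.0, 1.0)
            return soft + (torch.sign(z) - soft).detach()

        def adabin_quantize(x, c, d):
            # Snap x to the nearer of {c - d, c + d}; gradients still reach x, c, and d.
            return c + d * sign_ste(x - c)

        class AdaBinActivation(nn.Module):
            # Per-layer center c and distance d for activations, learned end-to-end.
            def __init__(self):
                super().__init__()
                self.c = nn.Parameter(torch.zeros(1))
                self.d = nn.Parameter(torch.ones(1))

            def forward(self, x):
                return adabin_quantize(x, self.c, self.d)

        def adabin_weights(w):
            # Illustrative equalization: align the center with the weight mean and use the
            # mean absolute deviation as the distance (a stand-in for the paper's KL fit).
            c = w.mean()
            d = (w - c).abs().mean()
            return adabin_quantize(w, c, d)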

    Design Space Exploration of Neural Network Activation Function Circuits

    The widespread application of artificial neural networks has prompted researchers to experiment with FPGA and customized ASIC designs to speed up their computation. These implementation efforts have generally focused on weight multiplication and signal summation operations, and less on the activation functions used in these applications. Yet efficient hardware implementations of nonlinear activation functions like Exponential Linear Units (ELU), Scaled Exponential Linear Units (SELU), and Hyperbolic Tangent (tanh) are central to designing effective neural network accelerators, since these functions require significant hardware resources. In this paper, we explore efficient hardware implementations of activation functions using purely combinational circuits, with a focus on two widely used nonlinear activation functions, SELU and tanh. Our experiments demonstrate that neural networks are generally insensitive to the precision of the activation function. The results also show that the proposed combinational circuit-based approach is very efficient in terms of speed and area, with negligible accuracy loss on the MNIST, CIFAR-10, and ImageNet benchmarks. Synopsys Design Compiler synthesis results show that the circuit designs for tanh and SELU can save 3.13-7.69x and 4.45-8.45x in area, respectively, compared to LUT/memory-based implementations, and can operate at 5.14 GHz and 4.52 GHz using the 28 nm SVT library. The implementation is available at: https://github.com/ThomasMrY/ActivationFunctionDemo. Comment: 5 pages, 5 figures, 16 conferenc
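    As a software-level illustration of the precision-insensitivity claim (not the paper's circuit), the sketch below approximates tanh with a handful of linear segments, the kind of function that maps naturally onto a small combinational circuit; in hardware the knot values would be baked in as fixed-point constants rather than computed with math.tanh. The knot placement here is our own assumption.

        import math

        KNOTS = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0)  # segment boundaries, sign-symmetric

        def pwl_tanh(x):
            # Linear interpolation between tanh values sampled at fixed knots,
            # saturating to +/-1 beyond the last knot; odd symmetry halves the table.
            s, a = (1.0, x) if x >= 0 else (-1.0, -x)
            if a >= KNOTS[-1]:
                return s
            for lo, hi in zip(KNOTS, KNOTS[1:]):
                if a <= hi:
                    t = (a - lo) / (hi - lo)
                    return s * (math.tanh(lo) + t * (math.tanh(hi) - math.tanh(lo)))

        # Worst-case error over [-4, 4] is on the order of 1e-2, typically below
        # the noise floor of the downstream network's accuracy.
        err = max(abs(pwl_tanh(i / 100) - math.tanh(i / 100)) for i in range(-400, 401))
        print(f"max |pwl_tanh - tanh| = {err:.4f}")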

    Visual Semantic SLAM with Landmarks for Large-Scale Outdoor Environment

    Semantic SLAM is an important field in autonomous driving and intelligent agents: it enables robots to perform high-level navigation tasks, acquire simple cognition or reasoning ability, and support language-based human-robot interaction. In this paper, we build a system that creates a semantic 3D map of large-scale environments by combining the 3D point cloud from ORB-SLAM with semantic segmentation from the convolutional neural network model PSPNet-101. In addition, we build a new dataset for the KITTI sequences, containing GPS information and landmark labels from Google Maps for the streets covered by those sequences. Moreover, we present a way to associate real-world landmarks with the point cloud map, and we build a topological map on top of the semantic map. Comment: Accepted by 2019 China Symposium on Cognitive Computing and Hybrid Intelligence (CCHI'19)
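    A minimal sketch of the point-cloud/segmentation fusion step as we understand it from the abstract: project each ORB-SLAM map point into the keyframes that observe it and take a majority vote over the PSPNet-101 labels at the projected pixels. The function name, matrix conventions, and vote-based fusion are our assumptions, not the paper's published code.

        import numpy as np
        from collections import Counter

        def label_map_points(points_w, poses_cw, seg_maps, K):
            # points_w: (N, 3) map points in the world frame
            # poses_cw: per-keyframe 4x4 world-to-camera transforms
            # seg_maps: per-keyframe (H, W) integer label images from PSPNet-101
            # K:        3x3 pinhole camera intrinsics
            votes = [Counter() for _ in range(len(points_w))]
            homog = np.hstack([points_w, np.ones((len(points_w), 1))])
            for T_cw, seg in zip(poses_cw, seg_maps):
                pc = (T_cw @ homog.T).T[:, :3]        # points in the camera frame
                idx = np.flatnonzero(pc[:, 2] > 0.1)  # keep points in front of the camera
                uv = (K @ pc[idx].T).T
                uv = uv[:, :2] / uv[:, 2:3]           # perspective division
                h, w = seg.shape
                for j, i in enumerate(idx):
                    u, v = int(round(uv[j, 0])), int(round(uv[j, 1]))
                    if 0 <= u < w and 0 <= v < h:
                        votes[i][int(seg[v, u])] += 1  # one vote per observing keyframe
            # Majority label per point; -1 marks points never observed by a keyframe.
            return [c.most_common(1)[0][0] if c else -1 for c in votes]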

    Scalable, accurate multicore simulation in the 1000-core era

    We present HORNET, a parallel, highly configurable, cycle-level multicore simulator based on an ingress-queued wormhole-router NoC architecture. The parallel simulation engine offers both cycle-accurate and periodic synchronization; while preserving functional accuracy, this permits a tradeoff between perfect timing accuracy and high speed with very good accuracy. When run on 6 separate physical cores on a single die, speedups exceed 5x, and when run on a two-die 12-core system with 2-way hyperthreading, speedups exceed 11x. Most hardware parameters are configurable, including the memory hierarchy, interconnect geometry, bandwidth, crossbar dimensions, and the parameters driving power and thermal effects. A highly parametrized table-based NoC design supports a variety of routing and virtual-channel allocation algorithms out of the box, ranging from simple DOR routing to complex Valiant, ROMM, or PROM schemes, BSOR, and adaptive routing. HORNET can run in network-only mode using synthetic traffic or traces, directly emulate a MIPS-based multicore, or serve as the memory subsystem for native applications executed under the Pin instrumentation tool. HORNET is freely available under the open-source MIT license at http://csg.csail.mit.edu/hornet/.
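    The accuracy/speed knob described above comes down to how often the per-tile simulation threads resynchronize. The toy below is our reconstruction of that idea in Python, not HORNET code: a quantum of 1 cycle gives lockstep, cycle-accurate execution, while larger quanta let each thread run ahead between barriers, trading timing accuracy for speed. The tile.step interface is assumed for illustration.

        import threading

        def run_tiles(tiles, total_cycles, quantum):
            # Each worker simulates one tile; a barrier every `quantum` cycles bounds
            # the timing skew between tiles (quantum=1 reproduces cycle-accurate lockstep).
            barrier = threading.Barrier(len(tiles))

            def worker(tile):
                cycle = 0
                while cycle < total_cycles:
                    # Run a quantum of cycles without cross-thread communication.
                    for _ in range(min(quantum, total_cycles - cycle)):
                        tile.step(cycle)  # advance this tile's router/pipeline one cycle
                        cycle += 1
                    barrier.wait()        # resynchronize; cross-tile traffic becomes visible

            threads = [threading.Thread(target=worker, args=(t,)) for t in tiles]
            for t in threads:
                t.start()
            for t in threads:
                t.join()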

    Atomically Dispersed Pd on Nanodiamond/Graphene Hybrid for Selective Hydrogenation of Acetylene

    An atomically dispersed palladium (Pd) catalyst supported on a defective nanodiamond-graphene hybrid (ND@G) is reported here for the selective hydrogenation of acetylene in the presence of abundant ethylene. The catalyst exhibits remarkable performance for the selective conversion of acetylene to ethylene: complete conversion (100%), high ethylene selectivity (90%), and good stability (steady for at least 30 hours). The unique structure of the catalyst, i.e., the atomic dispersion of Pd atoms on graphene anchored through Pd-C bonds, ensures facile desorption of ethylene and suppresses its over-hydrogenation to undesired ethane, which is the key to the catalyst's outstanding selectivity.

    Tin Assisted Fully Exposed Platinum Clusters Stabilized on Defect-Rich Graphene for Dehydrogenation Reaction

    Tin-assisted, fully exposed Pt clusters are fabricated on a core-shell nanodiamond@graphene (ND@G) hybrid support (a-PtSn/ND@G). The atomically dispersed Pt clusters, containing an average of 3 Pt atoms each, are anchored on the ND@G support with the assistance of Sn atoms as a partition agent and through Pt-C bonds between the Pt clusters and the defect-rich graphene nanoshell. The atomically dispersed Pt clusters guarantee full metal availability to the reactants, high thermal stability, and optimized adsorption/desorption behavior. This inhibits side reactions and enhances catalytic performance in the direct dehydrogenation of n-butane at a low temperature of 450 °C, yielding >98% selectivity toward olefin products, with a turnover frequency (TOF) for a-PtSn/ND@G approximately 3.9 times higher than that of the traditional Pt3Sn alloy catalyst supported on Al2O3 (Pt3Sn/Al2O3).